%matplotlib inline
from IPython.display import Image
%config InlineBackend.figure_format = 'svg'
# export slides with terminal command:
# jupyter nbconvert run_haats.ipynb --to slides --post serve --reveal-prefix http://cdn.bootcss.com/reveal.js/3.1.0
from import_data import *
from estimation import *
np.random.seed(222)
plt.close("all")
from matplotlib import rc

How do we know when to go long/short Break-Evens (Nominal Bonds vs ILBs)?
What is the market-implied inflation rate?
How can we extract the term premium using a theoretical/structural model?
How can we forecast both nominal and ILB yields in a consistent, theoretically robust manner?
How can we extract and forecast the market-implied discount rates which price these securities?

For example: if Break-Evens (BE) come down quite a bit, what does that mean? Should we go long or short Break-Evens (by trading ILBs against Nominal Bonds, or by using derivatives)?
But wait… BE = Expected Inflation + Inflation Risk Premium (IRP), so a move in BE can reflect a change in either component.
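To make the identification problem concrete, a minimal numeric sketch (all values are hypothetical):

```python
# Hypothetical 10-year yields (annualized, in decimals) -- illustrative numbers only
y_nominal = 0.045   # nominal Treasury yield
y_real = 0.020      # ILB ("real") yield

# Break-even inflation is the nominal-real yield spread
breakeven = y_nominal - y_real          # 2.5%

# BE = Expected Inflation + Inflation Risk Premium (IRP):
# the same break-even is consistent with many different splits,
# which is why we need a model to separate the two components
expected_inflation = 0.022              # hypothetical model output
irp = breakeven - expected_inflation    # 0.3%
```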
Suppose there are several nominal bonds (N) and several inflation-linked ("real") bonds (R).
The no-arbitrage price of a zero-coupon bond with maturity $\tau$ is:
\begin{eqnarray*} P_{t}^{N}\{\tau\}&=&E_{t}\left[\frac{M_{t+\tau}^{N}}{M_{t}^{N}}\times1\right]=\exp\left(-y_{t}^{N}\left\{ \tau\right\} \cdot\tau\right)\\&&\\P_{t}^{R}\{\tau\}&=&E_{t}\left[\frac{M_{t+\tau}^{R}}{M_{t}^{R}}\times1\right]=\exp\left(-y_{t}^{R}\left\{ \tau\right\} \cdot\tau\right) \end{eqnarray*}Here, $y_{t}^{N}$ and $y_{t}^{R}$ are the nominal and real yields, respectively. $M_{t}^{N}$ and $M_{t}^{R}$ are the nominal and real state price densities.
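The price-yield relation above is easy to verify numerically; a minimal sketch under continuous compounding (not from the model code):

```python
import numpy as np

def zcb_price(y, tau):
    """Zero-coupon price P = exp(-y * tau) under continuous compounding."""
    return np.exp(-y * tau)

def zcb_yield(p, tau):
    """Invert a zero-coupon price back to its continuously compounded yield."""
    return -np.log(p) / tau

# A 5-year zero at a 3% yield, and the round-trip back to the yield
p = zcb_price(0.03, 5.0)
y = zcb_yield(p, 5.0)
```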
The SPDs follow
\begin{eqnarray*} \frac{dM_{t}^{R}}{M_{t}^{R}}&=&-r_{t}^{R}dt-\Gamma_{t}\cdot dW_{t}\\&&\\\frac{dM_{t}^{N}}{M_{t}^{N}}&=&-r_{t}^{N}dt-\Gamma_{t}\cdot dW_{t} \end{eqnarray*}For what follows, to simplify the notation, let us suppress the N and R superscripts.
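Linking the two SPDs through the price level $\Pi_{t}$ (so that $M_{t}^{N}=M_{t}^{R}/\Pi_{t}$) delivers the break-even decomposition used earlier; schematically (the IRP term here also absorbs a Jensen/convexity adjustment):
\begin{eqnarray*} \text{BE}_{t}\{\tau\}&\equiv&y_{t}^{N}\{\tau\}-y_{t}^{R}\{\tau\}\\&=&\underbrace{\frac{1}{\tau}E_{t}\left[\log\frac{\Pi_{t+\tau}}{\Pi_{t}}\right]}_{\text{expected inflation}}+\underbrace{\text{IRP}_{t}\{\tau\}}_{\text{inflation risk premium}} \end{eqnarray*}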
The short rates and risk prices are assumed to be affine in the state variables. This is where we put the rabbit inside the hat...
\begin{eqnarray*} r_{t}&=&\rho_{0}+\rho_{1}\cdot X_{t} \\ \Gamma_{t}&=&\gamma_{0}+\gamma_{1}\cdot X_{t} \end{eqnarray*}Solving uniquely for the resulting bond-pricing coefficients $B_{t}\{\tau\}$ and $G_{t}\{\tau\}$ typically requires imposing ad-hoc parameter values and other restrictions with little economic motivation.
Instead, following Christensen, Lopez, and Rudebusch (2010), we assume a dynamic Nelson-Siegel (1987) model and impose level, slope, and curvature restrictions by replacing the Nelson-Siegel (1987) parameters with level, slope, and curvature state variables.
This is sensible since there is ample evidence that level, slope, and curvature (e.g., from PCA) explain the cross-section of government bond yields. Furthermore, the data we will use to fit the model are themselves smoothed yields from Nelson-Siegel-Svensson-type models.
Hence, the second key assumption concerns the state variables:
\begin{eqnarray*} X_{t}&=&\left(\begin{array}{c} L_{t}^{N}\\ S_{t}\\ C_{t}\\ L_{t}^{R} \end{array}\right) \\ dX_{t}&=&K^{P}\left(\theta^{P}-X_{t}\right)dt+\Sigma dW_{t} \end{eqnarray*}This implies
\begin{eqnarray*} y_{t}^{N}\{\tau\} &=& L_{t}^{N}+S_{t}\left(\frac{1-e^{-\lambda\tau}}{\lambda\tau}\right)+C_{t}\left(\frac{1-e^{-\lambda\tau}}{\lambda\tau}-e^{-\lambda\tau}\right)-\frac{G_{t}^{N}\left\{ \tau\right\} }{\tau} \\ y_{t}^{R}\{\tau\} &=& L_{t}^{R}+\alpha^{R}S_{t}\left(\frac{1-e^{-\lambda\tau}}{\lambda\tau}\right)+\alpha^{R}C_{t}\left(\frac{1-e^{-\lambda\tau}}{\lambda\tau}-e^{-\lambda\tau}\right)-\frac{G_{t}^{R}\left\{ \tau\right\} }{\tau} \end{eqnarray*}We observe the Nominal bond and ILB yields but the state variables (X) are hidden. We also need to estimate the model parameters.
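The level/slope/curvature loadings in the yield equations above can be computed directly; a minimal sketch (the value of $\lambda$ is illustrative):

```python
import numpy as np

def ns_loadings(tau, lam):
    """Dynamic Nelson-Siegel level/slope/curvature loadings at maturities tau (years)."""
    tau = np.asarray(tau, dtype=float)
    level = np.ones_like(tau)                          # level loads equally at all maturities
    slope = (1.0 - np.exp(-lam * tau)) / (lam * tau)   # decays from 1 toward 0
    curvature = slope - np.exp(-lam * tau)             # hump-shaped in maturity
    return level, slope, curvature

taus = np.array([2.0, 5.0, 10.0])
lvl, slp, crv = ns_loadings(taus, lam=0.5)
```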
First, we can re-write the key equations of the model...
Measurements: $$y_{t}=A_{0}+A_{1}X_{t}+\epsilon_{t} \qquad \text{with }\epsilon_{t}\sim N(0,\Phi)$$
States: $$X_{t}=U_0+U_1 X_{t-1}+\eta_{t} \qquad \text{with } \eta_{t}\sim N(0,Q)$$
Since the Brownian motion increments are Gaussian, Kalman filtering is an efficient and consistent estimator. It also accommodates asynchronous observations (crucial if we later include observed macro variables as state variables).
We use the Kalman filter along with MLE to jointly extract the hidden states and estimate the model parameters.
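The `Kalman` class used below comes from the accompanying `estimation` module. As a self-contained illustration of the filtering recursion it is assumed to implement (treating $Q$ and $\Phi$ here as noise covariances), a minimal sketch:

```python
import numpy as np

def kalman_filter(Y, A0, A1, U0, U1, Q, Phi, x0, V0):
    """Minimal Kalman filter for
         y_t = A0 + A1 x_t + eps_t,      eps_t ~ N(0, Phi)
         x_t = U0 + U1 x_{t-1} + eta_t,  eta_t ~ N(0, Q)
    Returns filtered states and the cumulative Gaussian log-likelihood."""
    T, m = Y.shape
    s = x0.shape[0]
    x, V = x0, V0
    X_filt = np.empty((T, s))
    loglik = 0.0
    for t in range(T):
        # predict step
        x_pred = U0 + U1 @ x
        V_pred = U1 @ V @ U1.T + Q
        # update step
        v = Y[t] - (A0 + A1 @ x_pred)          # innovation
        S = A1 @ V_pred @ A1.T + Phi           # innovation covariance
        K = V_pred @ A1.T @ np.linalg.inv(S)   # Kalman gain
        x = x_pred + K @ v
        V = (np.eye(s) - K @ A1) @ V_pred
        loglik += -0.5 * (m * np.log(2.0 * np.pi)
                          + np.linalg.slogdet(S)[1]
                          + v @ np.linalg.inv(S) @ v)
        X_filt[t] = x
    return X_filt, loglik
```

Maximizing `loglik` over the model parameters is what turns the filter into an MLE procedure.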
Importing and cleaning up (Nelson-Siegel smoothed/fitted) yield data for TIPS and Nominal bonds...
Data source:
http://www.federalreserve.gov/econresdata/researchdata/feds200628.xls
https://www.federalreserve.gov/econresdata/researchdata/feds200805.xls
# Import data from FED and doing some cleaning
tips_data, nominal_data = ImportData.importUS_Data(plots=1,save=1)
fig, ax = plt.subplots(1)
figures = {'fig1': fig, 'ax_fig1': ax}
nominal_data.plot(ax=figures['ax_fig1'],figsize=(8,8))
plt.legend(loc='center left',fontsize=9,frameon=0, bbox_to_anchor=(1, 0.5))
figures['ax_fig1'].set_title('US Nominal Bonds')
plt.show()
fig, ax = plt.subplots(1)
figures = {'fig2': fig, 'ax_fig2': ax}
tips_data.plot(ax=figures['ax_fig2'],figsize=(8,8))
plt.legend(loc='center left',fontsize=9,frameon=0, bbox_to_anchor=(1, 0.5))
figures['ax_fig2'].set_title('US Inflation-Linked Bonds')
plt.rcParams['axes.labelcolor'] = 'black'
plt.show()
Measurements: $$y_{t}=A_{0}+A_{1}X_{t}+\epsilon_{t} \qquad \text{with }\epsilon_{t}\sim N(0,\Phi)$$
States: $$X_{t}=U_0+U_1 X_{t-1}+\eta_{t} \qquad \text{with } \eta_{t}\sim N(0,Q)$$
Assign model parameters
T_ = 100  # number of dates
m = 14    # number of bonds
s = 4     # number of states
A0 = np.mat(np.random.randn(m, 1))
A1 = np.mat(np.random.randn(m, s))
U0 = np.mat(np.random.randn(s, 1))
U1 = np.mat(np.diag(np.diag(np.random.rand(s, s))))
Q = np.mat(np.diag(np.diag(np.random.rand(s, s))))
Phi = np.mat(np.diag(np.diag(np.random.rand(m, m))))
Simulate state variables (X) and the measurement (Y)
X0 = np.mat(np.random.randn(s, 1))
X = np.mat(np.empty((T_, s)) * np.nan)
Y = np.mat(np.empty((T_, m)) * np.nan)
for t in range(T_):
    if t == 0:
        X[t, :] = (U0 + U1 * X0 + Q * np.mat(np.random.randn(s, 1))).T
    else:
        X[t, :] = (U0 + U1 * X[t - 1, :].T + Q * np.mat(np.random.randn(s, 1))).T
    Y[t, :] = (A0 + A1 * X[t, :].T + Phi * np.mat(np.random.randn(m, 1))).T
X_df = pd.DataFrame(np.array(X),
                    columns=['LN', 'S', 'C', 'LR'],
                    index=pd.date_range('2000-01-01', periods=X.shape[0]))
Y_df = pd.DataFrame(np.array(Y),
                    columns=['yield_' + str(i) for i in range(m)],
                    index=pd.date_range('2000-01-01', periods=Y.shape[0]))
plt.rc('text', usetex=False)
fig, ax = plt.subplots(1)
figures = {'fig2': fig, 'ax_fig2': ax}
Y_df.plot(ax=figures['ax_fig2'],figsize=(8,8),linewidth=1)
plt.legend(loc='center left',fontsize=12,frameon=0, bbox_to_anchor=(1, 0.5))
figures['ax_fig2'].set_title('Measurements')
plt.show()
fig, ax = plt.subplots(1)
figures = {'fig2': fig, 'ax_fig2': ax}
X_df.plot(ax=figures['ax_fig2'],figsize=(8,8),linewidth=1)
plt.legend(loc='center left',fontsize=12,frameon=0, bbox_to_anchor=(1, 0.5))
figures['ax_fig2'].set_title('Hidden States')
plt.rcParams['axes.labelcolor'] = 'black'
plt.show()
kalman = Kalman(Y_df, A0, A1, U0, U1, Q, Phi,statevar_names = X_df.columns.values)
Ytt_filtered, Yttl_filtered, Xtt_filtered, Xttl_filtered, Vtt, Vttl, Gain_t, eta_t_filtered, cum_log_likelihood = kalman.filter()
linestyles = ['-', '--', '-.', ':','-', '--', '-.', ':','-', '--', '-.', ':','-', '--', '-.', ':']
plt.rc('text', usetex=True)
fig, ax = plt.subplots(1)
figures = {'fig2': fig, 'ax_fig2': ax}
pd.concat(
    [X_df,
     Xtt_filtered.rename(columns={i: i + '$_{k|k}$' for i in Xtt_filtered.columns.values}, inplace=False),
     Xttl_filtered.rename(columns={i: i + '$_{k|k-1}$' for i in Xttl_filtered.columns.values}, inplace=False)],
    axis=1).plot(ax=figures['ax_fig2'], figsize=(8, 8), style=linestyles, linewidth=1)
plt.legend(loc='center left',fontsize=12,frameon=0, bbox_to_anchor=(1, 0.5))
figures['ax_fig2'].set_title('Hidden States and Filtered States')
plt.rcParams['axes.labelcolor'] = 'black'
plt.show()
linestyles = ['-', '--', '-.', ':','-', '--', '-.', ':','-', '--', '-.', ':','-', '--', '-.', ':']
plt.rc('text', usetex=True)
fig, ax = plt.subplots(1)
figures = {'fig2': fig, 'ax_fig2': ax}
pd.concat(
    [Y_df.iloc[:, 0:4].rename(columns={i: i.replace('_', r'\_') for i in Y_df.columns.values}, inplace=False),
     Ytt_filtered.iloc[:, 0:4].rename(columns={i: i.replace('_', r'\_') + '$_{k|k}$' for i in Ytt_filtered.columns.values}, inplace=False),
     Yttl_filtered.iloc[:, 0:4].rename(columns={i: i.replace('_', r'\_') + '$_{k|k-1}$' for i in Yttl_filtered.columns.values}, inplace=False)],
    axis=1).plot(ax=figures['ax_fig2'], figsize=(8, 8), style=linestyles, linewidth=1)
plt.legend(loc='center left',fontsize=12,frameon=0, bbox_to_anchor=(1, 0.5))
figures['ax_fig2'].set_title('Measurements and Filtered Measurements')
plt.rcParams['axes.labelcolor'] = 'black'
plt.show()
forecast, forecast_std, forecast_cov = kalman.forecast(Xtt_filtered, horizon=10)
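The `Kalman.forecast` call above comes from the accompanying `estimation` module. A self-contained sketch of the h-step-ahead recursion it is assumed to implement, iterating the state equation forward from the last filtered state:

```python
import numpy as np

def forecast_states(x_t, U0, U1, Q, horizon):
    """Iterate x_{t+h} = U0 + U1 x_{t+h-1} and accumulate the forecast covariance.

    x_t is the filtered state at the forecast origin; Q is treated as the
    state-noise covariance. Returns (means, covariances) for h = 1..horizon.
    """
    s = x_t.shape[0]
    means, covs = [], []
    x, V = x_t.copy(), np.zeros((s, s))
    for _ in range(horizon):
        x = U0 + U1 @ x          # conditional mean rolls forward
        V = U1 @ V @ U1.T + Q    # uncertainty accumulates with the horizon
        means.append(x.copy())
        covs.append(V.copy())
    return np.array(means), np.array(covs)
```

Measurement forecasts then follow as $A_0 + A_1 x_{t+h}$ with covariance $A_1 V_{t+h} A_1^\top + \Phi$.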
# import seaborn as sns
import seaborn.apionly as sns  # use sns.distplot but keep the default matplotlib styling
# (note: seaborn.apionly is removed in seaborn >= 0.9; use plain `import seaborn as sns` there)
sns.set("talk", font_scale=1, rc={"lines.linewidth": 1,"axes.labelcolor":'black',"text.color":'black'})
fig, ax = plt.subplots(1)
plt.rcParams['axes.labelcolor'] = 'black'
# actual series: one row per forecast origin (horizon + 1 = 11 rows per origin)
actual = forecast[['yield_5']].iloc[np.arange(0, forecast.shape[0], 10 + 1)]
line, = plt.plot(actual.set_index(actual.index.get_level_values('date')), linewidth=1.5)
line.set_label('yield_5')
for t in forecast.index.get_level_values('date').unique():
    fc = forecast[['yield_5']].iloc[forecast.index.get_level_values('date') == t, :]
    line, = plt.plot(fc.set_index(fc.index.get_level_values('horizon')),
                     color='red', linewidth=0.5)
line.set_label('forecasts')
plt.legend()
plt.show()
fig, ax = plt.subplots(1, figsize=(14, 8))
actual = forecast[['yield_5']].iloc[np.arange(0, forecast.shape[0], 10 + 1)]
line, = plt.plot(actual.set_index(actual.index.get_level_values('date')), linewidth=1.5)
line.set_label('yield_5: actual')
for t in forecast.index.get_level_values('date').unique():
    fc = forecast[['yield_5']].iloc[forecast.index.get_level_values('date') == t, :]
    fc_std = forecast_std[['yield_5']].iloc[forecast_std.index.get_level_values('date') == t, :]
    horizons = fc.index.get_level_values('horizon')
    line, = plt.plot(fc.set_index(horizons), color='red', linewidth=0.5)
    # shade +/- one forecast standard error around each forecast path
    fill = plt.fill_between(horizons.values,
                            fc.values[:, 0] - fc_std.values[:, 0],
                            fc.values[:, 0] + fc_std.values[:, 0],
                            facecolor='red', interpolate=True, alpha=.05)
line.set_label('yield_5: forecasts')
fill.set_label('yield_5: forecasts stderr')
plt.legend(loc='best')
plt.show()
forecast_e, forecast_se, forecast_mse, forecast_rmse, forecast_mse_all, forecast_rmse_all = \
kalman.rmse(forecast)
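For reference, the per-horizon RMSE that `kalman.rmse` is assumed to report can be sketched as a root-mean-squared forecast error across forecast origins (a hypothetical helper, not the module's code):

```python
import numpy as np

def rmse_by_horizon(actual, predicted):
    """RMSE per forecast horizon.

    actual, predicted: arrays of shape (n_dates, n_horizons) holding realized
    values and their h-step-ahead forecasts; returns one RMSE per horizon.
    """
    err = predicted - actual
    return np.sqrt(np.mean(err ** 2, axis=0))
```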
plt.rc('text', usetex=False)
fig, ax = plt.subplots(1)
figures = {'fig2': fig, 'ax_fig2': ax}
forecast_rmse.plot(ax=figures['ax_fig2'],figsize=(8,8),linewidth=1)
plt.legend(loc='center left',fontsize=12,frameon=0, bbox_to_anchor=(1, 0.5))
figures['ax_fig2'].set_title('RMSE')
plt.rcParams['axes.labelcolor'] = 'black'
plt.show()
start_time = time.time()
# The interval between rolling windows: the gap by which the estimation window shifts
# (e.g. with tgap = 1, the rolling window is updated daily)
tgap = 30
# Rolling window: 0 if using an expanding window, 1 if using a rolling window
rolling = 0
# Rolling window size (in years). Use inf for full-sample estimation
windowsize = np.inf
np.set_printoptions(precision=32, suppress=True) #increase precision on numeric values
################################################
# PRIMITIVES:
figures = []
# use allow_missing_data = 0 to keep only dates where both ILB and Nominal yields are non-missing
allow_missing_data = 0
# set the frequency of the data: daily, weekly, monthly, quarterly, yearly
estim_freq = 'weekly'
fix_Phi = 1  # "1" to fix the volatility of observed yields using the covariance of historical data
             # "0" to estimate it jointly with the other model parameters
setdiag_Kp = 1  # "1" to restrict Kp to be diagonal, so the state variables are assumed independent
                # "0" to leave Kp unrestricted
# options for initializing the Kalman filter error variance:
# 'steady_state', 'unconditional', or 'identity' matrix
initV = 'unconditional'
# number of hidden state variables: 4 or 6
num_states = 4
# Specify the maturities of data we want to use
US_ilbmaturities = np.array([2, 3, 5, 6, 8, 9, 10])
US_nominalmaturities = np.array([2, 3, 5, 6, 8, 9, 10])
US_maturities = np.hstack((US_nominalmaturities, US_ilbmaturities))
############################################################
# Set start and end dates for estimation
sdate, edate = '2004-01-01', '2015-11-23'  # or edate = time.strftime("%Y-%m-%d") to run through today
print("start date: %s" % sdate)
print("end date: %s" % edate)
# extract data for desired maturities and dates
tips_data, nominal_data = ImportData.importUS_Data(US_ilbmaturities, US_nominalmaturities,plots=0,save=1)
data = ImportData.extract_subset(tips_data, nominal_data, sdate, edate, allow_missing_data, estim_freq)
estimation = Rolling()
estimation.run(data, US_ilbmaturities, US_nominalmaturities,
               estim_freq=estim_freq, num_states=num_states,
               fix_Phi=fix_Phi, setdiag_Kp=setdiag_Kp, initV=initV)
end_time = time.time()
print("--- %s seconds ---" % (end_time - start_time))
Detailed documentation/appendix can be found at https://github.com/GinoAndTonic/ssylvain_public/blob/master/research/haats/haats_documentation.lyx and https://github.com/GinoAndTonic/ssylvain_public/blob/master/research/haats/haats_documentation.pdf
My Python code can be forked from https://github.com/GinoAndTonic/ssylvain_public/blob/master/research/haats